Consul enable tag override master #5774
Conversation
In Nomad 0.9, we made volume driver handling the same for `""` and `"local"` volumes. Prior to Nomad 0.9, however, these had slightly different behaviour for relative paths and named volumes: the empty string would expand relative paths within the task dir, and `"local"` volumes that are not absolute paths would be treated as Docker named volumes. This commit reverts to the previous behaviour, as follows:

| Nomad Version | Driver    | Volume Spec      | Behaviour                  |
|---------------|-----------|------------------|----------------------------|
| all           | `""`      | testing:/testing | allocdir/testing           |
| 0.8.7         | `"local"` | testing:/testing | "testing" as named volume  |
| 0.9.0         | `"local"` | testing:/testing | allocdir/testing           |
| 0.9.1         | `"local"` | testing:/testing | "testing" as named volume  |
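For illustration only, here is a minimal sketch of the restored decision logic. The helper name `resolveVolumeSource` and the paths are made up; this is not the actual driver code, just the rule the table above describes.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveVolumeSource sketches the restored pre-0.9 split: with an empty
// volume driver, a relative host path is expanded inside the allocation's
// task directory; with the "local" driver, a non-absolute source is left
// untouched so Docker treats it as a named volume. Hypothetical helper,
// not the actual Nomad implementation.
func resolveVolumeSource(volumeDriver, source, taskDir string) string {
	if volumeDriver == "" && !filepath.IsAbs(source) {
		return filepath.Join(taskDir, source) // e.g. <allocdir>/testing
	}
	return source // "local": "testing" is passed through as a named volume
}

func main() {
	fmt.Println(resolveVolumeSource("", "testing", "/alloc/task"))      // /alloc/task/testing
	fmt.Println(resolveVolumeSource("local", "testing", "/alloc/task")) // testing (named volume)
}
```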
Here we retain the 0.8.7 behavior of waiting for driver fingerprints before registering a node, with some timeout. This is needed for system jobs, as system job scheduling for a node occurs at node registration, and the race might mean that a system job does not get placed on the node because of missing drivers. The timeout isn't strictly necessary, but we raise it to 1 minute as that is closer to "block indefinitely" than 1 second is. We need to keep the value high enough to capture as many drivers/devices as possible, but low enough that it doesn't risk blocking too long due to a misbehaving plugin. Fixes hashicorp#5579
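A minimal sketch of the wait-with-timeout pattern described above, assuming a hypothetical `done` channel that is closed once the initial fingerprint pass completes; the real client code differs.

```go
package main

import (
	"fmt"
	"time"
)

// waitForFingerprints blocks until the initial fingerprint pass finishes or
// the timeout elapses, then lets node registration proceed either way.
// Sketch only: "done" stands in for whatever signal the client actually uses.
func waitForFingerprints(done <-chan struct{}, timeout time.Duration) {
	select {
	case <-done:
		fmt.Println("fingerprints complete, registering node")
	case <-time.After(timeout):
		fmt.Println("timed out waiting for fingerprints, registering node anyway")
	}
}

func main() {
	done := make(chan struct{})
	go func() {
		time.Sleep(100 * time.Millisecond) // pretend drivers fingerprint quickly
		close(done)
	}()
	waitForFingerprints(done, time.Minute) // 1 minute, per the commit message
}
```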
I noticed that `watchNodeUpdates()` calls `retryRegisterNode()` almost immediately after `registerAndHeartbeat()`, well after 5 seconds. This call is unnecessary and made debugging a bit harder. So here, we ensure that we only re-register the node for new node events, not for the initial registration.
I noticed that the `detected drivers` log line was misleading: when a driver doesn't fingerprint before the timeout, its health status is the empty string `""`, which we would mark as detected. Now, we log all drivers along with their state to ease driver fingerprint debugging.
Currently, when logmon fails to reattach, we will retry reattachment to the same pid until the task restart specification is exhausted. Because we cannot clear hook state during error conditions, it is not possible for us to signal to a future restart that it _shouldn't_ attempt to reattach to the plugin. Here we revert to explicitly detecting reattachment separately from the launch of a new logmon, so we can recover from scenarios where a logmon plugin has failed. This is a net improvement over the current hard-failure situation, as it means that in the most common case (the pid has gone away) we can recover. Other reattachment failure modes, where the plugin may still be running, could potentially cause a duplicate process or a subsequent failure to launch a new plugin. A duplicate process could potentially cause duplicate logging, which is still better than a production workload outage. A subsequent failure to launch a new plugin would fail in the same way (retry until restarts are exhausted) as the current failure mode.
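A minimal sketch of the "try to reattach, fall back to a fresh launch" pattern using the go-plugin client. The `handshake`, `pluginMap`, `bin`, and `reattach` parameters are assumptions standing in for whatever the real client keeps in hook state, and the `"logmon"` argument is a hypothetical invocation, not the actual Nomad code path.

```go
package main

import (
	"os/exec"

	plugin "github.com/hashicorp/go-plugin"
)

// connectLogmon first tries to reattach to an existing plugin process and,
// if that fails (most commonly because the pid has gone away), launches a
// new plugin instead of retrying reattachment until restarts are exhausted.
func connectLogmon(handshake plugin.HandshakeConfig, pluginMap map[string]plugin.Plugin,
	bin string, reattach *plugin.ReattachConfig) (*plugin.Client, error) {

	if reattach != nil {
		c := plugin.NewClient(&plugin.ClientConfig{
			HandshakeConfig: handshake,
			Plugins:         pluginMap,
			Reattach:        reattach, // connect to the already-running process
		})
		if _, err := c.Client(); err == nil {
			return c, nil // reattached successfully
		}
		// Reattachment failed; clean up and fall through to a fresh launch.
		c.Kill()
	}

	c := plugin.NewClient(&plugin.ClientConfig{
		HandshakeConfig: handshake,
		Plugins:         pluginMap,
		Cmd:             exec.Command(bin, "logmon"), // hypothetical invocation
	})
	if _, err := c.Client(); err != nil {
		c.Kill()
		return nil, err
	}
	return c, nil
}

func main() {}
```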
Co-Authored-By: notnoop <mahmood@notnoop.com>
…cation instead of (unupdated) server copy
This reverts commit c97e4c6.
…ng left brackets" This reverts commit 7ea3f38.
Fixes hashicorp#5593. The executor seems to die unexpectedly after the nomad agent dies or is restarted. The crash seems to occur at the first log message after the nomad agent dies. To ease debugging, we forward executor log messages to executor.log as well as to Stderr. `go-plugin` sets up plugins with Stderr pointing to a pipe being read by the plugin client, the nomad agent in our case [1]. When the nomad agent dies, the pipe is closed, and any subsequent executor log write fails with ErrClosedPipe and a SIGPIPE signal. SIGPIPE results in the executor process dying. I considered adding a handler to ignore SIGPIPE, but the hclog library currently panics when a logging write operation fails [2]. Thus we opt to revert to the v0.8 behavior of exclusively writing logs to executor.log while we investigate alternative options.

[1] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-plugin/client.go#L528-L535
[2] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-hclog/int.go#L320-L323
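A minimal sketch of the reverted logging setup, assuming a hypothetical log path; it simply points hclog at a file instead of the Stderr pipe, so a closed go-plugin pipe can no longer deliver SIGPIPE to the executor.

```go
package main

import (
	"log"
	"os"
	"path/filepath"

	hclog "github.com/hashicorp/go-hclog"
)

func main() {
	// Hypothetical location; the real executor derives this from the task dir.
	logPath := filepath.Join(os.TempDir(), "executor.log")

	f, err := os.OpenFile(logPath, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Write exclusively to executor.log, not to Stderr.
	logger := hclog.New(&hclog.LoggerOptions{
		Name:   "executor",
		Level:  hclog.Debug,
		Output: f,
	})

	logger.Info("executor logging redirected to file", "path", logPath)
}
```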
driver/docker: collect tty container logs
…-errs-2 logmon: recover from shutting down call locally
…omad into consul_enable_tag_override_master

# Conflicts:
#	CHANGELOG.md
#	client/client.go
#	command/agent/bindata_assetfs.go
#	drivers/docker/docklog/docker_logger.go
#	drivers/docker/driver.go
#	version/version.go
Thank you for your submission! We require that all contributors sign our Contributor License Agreement ("CLA") before we can accept the contribution. 5 out of 6 committers have signed the CLA.
Nomad Release bot seems not to be a GitHub user. Have you signed the CLA already but the status is still pending?
Created a different PR with a cleaner commit history: #5775
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
At the moment, Consul's EnableTagOverride flag is not being set by Nomad, so the default (false) is always used. In certain cases it is useful for services outside Nomad to update the tags of Consul services; however, for that to happen, EnableTagOverride has to be set to true.
This change enables setting EnableTagOverride at a service level.
An older request describing this can be found here #2057
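For context, a minimal sketch of what the flag controls on the Consul side, using the Consul Go API directly; the service name, port, and tags are made up, and this is not the Nomad integration code itself.

```go
package main

import (
	"log"

	consulapi "github.com/hashicorp/consul/api"
)

func main() {
	client, err := consulapi.NewClient(consulapi.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// With EnableTagOverride set, external tooling may modify this service's
	// tags in Consul without the registration overwriting them on the next
	// sync. Name, port, and tags are invented for the example.
	reg := &consulapi.AgentServiceRegistration{
		Name:              "web",
		Port:              8080,
		Tags:              []string{"v1"},
		EnableTagOverride: true,
	}

	if err := client.Agent().ServiceRegister(reg); err != nil {
		log.Fatal(err)
	}
}
```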